Abstract: Previous analyses of forecasting theory either assume that the
true parameters are known or assume that the series is stationary. Not much
is known about forecasting theory for nonstationary processes with estimated
parameters. This paper investigates the recursive least squares forecast for
stationary and nonstationary processes with unit roots. We first prove that
the accumulated forecast mean square error can be decomposed into two
components, one of which arises from estimation uncertainty and the other
from the disturbance term. The former, of order log(T), is of second-order
importance relative to the latter, of order T. However, since the latter is
common to all predictors, it is the former that determines the property of
each predictor. Our theorem implies that the improvement in forecasting
precision is of order log(T) when the existence of a unit root is properly
detected and taken into account. Also, our theorem leads to a new proof of
strong consistency of predictive least squares in model selection and a new
test for unit roots in which no regression is needed.

The simulation results confirm our theoretical findings. In addition, we find
that while misspecification of the AR order and under-specification of the
number of unit roots have only a marginal impact on forecasting precision,
over-specification of the number of unit roots strongly deteriorates the
quality of long-term forecasts. As for the empirical study using Taiwanese
data, the results are mixed. Adaptive forecasting and imposing a unit root
improve forecast precision in some cases but worsen it in others.



1. Introduction 

Forecasting future observations is one of the major purposes of building a time
series model. Even for the purpose of time series control, forecasting provides
the essential basis. For this purpose, autoregressive (AR) models are widely used
for their simplicity. For an AR(p) process,

(1) $y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \cdots + \beta_p y_{t-p} + \varepsilon_t$

where $\phi(z) = 1 - \beta_1 z - \cdots - \beta_p z^p$, the characteristic polynomial, determines the
properties of the series: $y_t$ is called stationary or stable if all roots of $\phi$ are outside
the unit circle, unstable or nonstationary if some roots of $\phi$ are on the unit circle,
and explosive if some roots of $\phi$ are inside the unit circle. Previous analyses of
forecasting theory either assume that the true $\beta$'s are known or consider only the
stationary case. For example, Ing and Bhansali analyze the multistep prediction
of stationary AR processes, while Ing [3] derives the mean squared prediction errors
of the least squares predictors in the random walk model. Not much is known about
the forecasting theory for unstable processes with estimated parameters. This paper
investigates the recursive least squares forecast for stable and unstable processes.
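The classification above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `classify_ar` and the tolerance `tol` used to decide whether a root lies on the unit circle are illustrative choices, not part of the paper.

```python
import numpy as np

def classify_ar(beta, tol=1e-6):
    """Classify an AR(p) process by the roots of phi(z) = 1 - b1 z - ... - bp z^p."""
    beta = np.asarray(beta, dtype=float)
    # np.roots expects coefficients ordered from the highest power to the constant.
    coeffs = np.concatenate((-beta[::-1], [1.0]))
    moduli = np.abs(np.roots(coeffs))
    if np.any(moduli < 1.0 - tol):
        return "explosive"        # some root strictly inside the unit circle
    if np.any(np.abs(moduli - 1.0) <= tol):
        return "unstable"         # some root on the unit circle (numerically)
    return "stable"               # all roots outside the unit circle

# y_t = 2 y_{t-1} - 1.25 y_{t-2} + 0.25 y_{t-3} + e_t has phi(z) = (1 - 0.5z)^2 (1 - z).
print(classify_ar([2.0, -1.25, 0.25]))
```

Because roots are computed in floating point, a root "on" the unit circle can only be detected up to the tolerance, so the result is a numerical indication rather than a proof.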

Let $\hat y_t$ be the forecast of $y_t$ based upon information up to $t-1$. If one is interested
in the one-period forecast, $(y_t - \hat y_t)^2$ is the cost to be minimized. However, there are two
situations where the accumulated cost function, $\sum_{t=1}^{T}(y_t - \hat y_t)^2$, is more appropriate.

First, in the sequential forecast case (see Goodwin and Sin [6]), the forecasters are
updated sequentially over many periods and the accumulated cost function is the
target to be minimized. Second, for a single realization of a time series, the averaged
accumulated cost function is often used as the yardstick to evaluate the
out-of-sample forecasting performance of alternative forecasters.

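Ing [7] advocated adopting the accumulated cost function $\sum_{t=1}^{T} E(y_t - \hat y_t)^2$ over
the one-period expected loss function $E(y_{T+1} - \hat y_{T+1})^2$. For an AR(1) process, these
two quantities are, to leading order, respectively:

$\sum_{t=3}^{T} E(y_t - \hat y_t)^2 = T\sigma^2 + 2\sigma^2 \log T + \text{(lower-order terms)}$

$E(y_{T+1} - \hat y_{T+1})^2 = \sigma^2 + \frac{2\sigma^2}{T} + \text{(lower-order terms)}$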

when the true $\beta_1 = 1$. In other words, the efficiency loss from not taking the unit root
into consideration is greater for the accumulated cost function than for the one-period
cost function. See also Ing and Wei. It is worth mentioning that Rissanen's [14]
predictive least squares (PLS) for model selection is built upon accumulated cost
function minimization. See also Wei [18].



Under the assumption that $E(\varepsilon_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2$ a.s. for all $t$, where $\mathcal{F}_{t-1}$ is the sigma
field generated by $\{x_s, s \le t-1\}$, it can be shown under appropriate
assumptions that $\frac{1}{T}\sum_{t=1}^{T}(y_t - \hat y_t)^2 \to \sigma^2$ a.s. By Chow [4], it is seen that

$\sum_{t=1}^{T}(y_t - \hat y_t)^2 = \sum_{t=1}^{T}\varepsilon_t^2 + C_T(1 + o(1))$ a.s. on the set $\{C_T \to \infty\}$

$\sum_{t=1}^{T}(y_t - \hat y_t)^2 = \sum_{t=1}^{T}\varepsilon_t^2 + C_T(1 + O(1))$ a.s. on the set $\{C_T < \infty\}$

where

$C_T = \sum_{t=1}^{T}(y_t - \hat y_t - \varepsilon_t)^2$

While $\sum_{t=1}^{T}\varepsilon_t^2$ is larger in order than $C_T$, it is common to all forecasters and
cannot be removed. Hence $C_T$ becomes the more important quantity when evaluating
the performance of alternative forecasters.

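Let $\hat\beta_t$ be the least squares estimate of $\beta$,

$\hat\beta_t = \Bigl(\sum_{k=1}^{t} Y_{k-1} Y_{k-1}'\Bigr)^{-1} \sum_{k=1}^{t} Y_{k-1} y_k$

where $Y_t = (y_t, \ldots, y_{t-p+1})'$; then $\hat y_t = \hat\beta_{t-1}' Y_{t-1}$ is the least squares prediction of $y_t$ at
time $t-1$.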
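To make the recursive least squares prediction and the quantity $C_T$ concrete, here is a minimal sketch for the AR(1) case using only the Python standard library. The function names, the burn-in `t0` and the simulated random walk are illustrative assumptions; the shocks are available only because the data are simulated.

```python
import math
import random

def simulate_random_walk(T, seed=0):
    """Return (y, eps) with y_t = y_{t-1} + eps_t, y_0 = 0 and N(0, 1) shocks."""
    rng = random.Random(seed)
    eps = [0.0] + [rng.gauss(0.0, 1.0) for _ in range(T)]
    y = [0.0]
    for t in range(1, T + 1):
        y.append(y[-1] + eps[t])
    return y, eps

def accumulated_ct(y, eps, t0=10):
    """Sum over t > t0 of (y_t - yhat_t - eps_t)^2 for a recursively fitted AR(1)."""
    sxy = sxx = 0.0                      # running sums of y_k * y_{k-1} and y_{k-1}^2
    ct = 0.0
    for t in range(1, len(y)):
        if t > t0 and sxx > 0.0:
            beta_hat = sxy / sxx         # estimate uses observations up to t-1 only
            y_hat = beta_hat * y[t - 1]  # one-step least squares prediction
            ct += (y[t] - y_hat - eps[t]) ** 2
        sxy += y[t] * y[t - 1]           # update the sums after predicting y_t
        sxx += y[t - 1] ** 2
    return ct

y, eps = simulate_random_walk(2000)
print(accumulated_ct(y, eps), 2.0 * math.log(2000))  # one path vs. the 2*sigma^2*log(T) rate
```

A single simulated path fluctuates around the logarithmic rate, so the two printed numbers need not be close; averaging over many replications is what the Monte Carlo section does.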

Let

$\phi(z) = (z - 1)^{a} (z + 1)^{b} \prod_{k=1}^{l} (z^2 - 2\cos\theta_k\, z + 1)^{d_k}\, \pi(z)$

where all roots of $\pi(z)$ are outside the unit circle. Wei [17] proves that

(2) $C_T \sim \Bigl(p + a^2 + b^2 + 2\sum_{k=1}^{l} d_k^2\Bigr)\sigma^2 \log(T)$ in probability.

In other words, when $\phi(z)$ has multiple unit roots, the accumulated loss increases not
linearly with the number of unit roots but at the rate of the square of the number
of unit roots.

In this paper, we prove that when $\phi(z)$ has no complex unit roots, the convergence
in (2) can be strengthened to almost sure convergence. This result leads to a new proof
of strong consistency of PLS in AR model selection. It is also conjectured that the
almost sure convergence result holds for the case of complex unit roots. We
conduct several simulation experiments to assess the convergence result for various
sample sizes. In addition, we consider the impact of near unit roots and model
misspecification on multi-step forecasting. Finally, we apply our methods to six real
macroeconomic series in Taiwan. The forecasting performance of various forecasters and
of the adaptive forecaster is investigated.

The rest of the paper is organized as follows. The proof of the main theorem
is given in Section 2. Section 3 illustrates implications and applications of our main
theorem. Section 4 discusses multi-step and adaptive forecasts. Monte Carlo results
are reported in Section 5, and Section 6 summarizes the empirical results. Section 7
concludes.



2. Main theorem 



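Assume that $\varepsilon_t$ are i.i.d. random variables with $E(\varepsilon_t) = 0$ and $0 < E(\varepsilon_t^2) = \sigma^2 < \infty$.
Let $X_t = (x_{t-1}, \ldots, x_{t-p})'$, $S_T = \sum_{t=1}^{T} \varepsilon_t$ and $T_t = (-1)^t \sum_{k=1}^{t} (-1)^k \varepsilon_k = \varepsilon_t + (-1)\,T_{t-1}$.

Lemma 1. Assume that $X_{t+1} = A X_t + \boldsymbol{\varepsilon}_t$, where $\boldsymbol{\varepsilon}_t = (\varepsilon_t, 0, \ldots, 0)'$ and the
eigenvalues of $A$ are all inside the unit circle. Then

$\lim_{T\to\infty} \frac{\sum_{t=1}^{T} X_t S_t}{T^{1/2}\bigl(\sum_{t=1}^{T} S_t^2\bigr)^{1/2}} = 0$ a.s.

and

$\lim_{T\to\infty} \frac{\sum_{t=1}^{T} X_t T_t}{T^{1/2}\bigl(\sum_{t=1}^{T} T_t^2\bigr)^{1/2}} = 0$ a.s.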

Proof. It is known from Lai and Wei [12, pages 363 and 364] that

(3) $\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} X_t X_t' = \Sigma$ a.s.

where $\Sigma$ is a positive definite matrix,

(4) $\limsup_{T\to\infty} \frac{\sum_{t=1}^{T} S_t^2}{T^2 \log\log(T)} = \frac{8\sigma^2}{\pi^2}$ a.s.

and

(5) $\liminf_{T\to\infty} \frac{\sum_{t=1}^{T} S_t^2}{T^2/\log\log(T)} = \frac{\sigma^2}{4}$ a.s.

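Let $\|u\|$ denote the Euclidean norm of a $k$-dimensional vector $u = (u_1, \ldots, u_k)'$,
i.e., $\|u\|^2 = \sum_{i=1}^{k} u_i^2$. By (3), $\|X_T\|^2 / T \to 0$ a.s. and in turn we have that

$0 \le X_T'\Bigl(\sum_{t=1}^{T} X_t X_t'\Bigr)^{-1} X_T \le \frac{\|X_T\|^2}{\lambda_{\min}\bigl(\sum_{t=1}^{T} X_t X_t'\bigr)} = o(1)$ a.s.

and

(6) $X_T'\Bigl(\sum_{t=1}^{T} X_t X_t'\Bigr)^{-1} X_T \to 0$ a.s.

where $\lambda_{\min}(A)$ denotes the minimal eigenvalue of matrix $A$.
Furthermore, by the law of the iterated logarithm,

$\limsup_{T\to\infty} \frac{S_T^2}{2T\log\log T} = \sigma^2$ a.s.

Hence (5) implies that

(7) $\frac{S_T^2}{\sum_{t=1}^{T} S_t^2} = O\Bigl(\frac{T\log\log T}{T^2/\log\log T}\Bigr) = O\Bigl(\frac{(\log\log T)^2}{T}\Bigr) = o(1)$ a.s.

Now, let

$Z_T = \frac{\sum_{t=1}^{T} X_t S_t}{T^{1/2}\bigl(\sum_{t=1}^{T} S_t^2\bigr)^{1/2}}$
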

Then 



Zt — Zt-1 = Zt — - — —f ^7777 ^ t — i — y j ' 



(8) = .^^T 0(1), by® 



Xj'St 

T 1/2 

: 0(1)- 0(1), Since sup ||Zt|| < V ||Xt||2} a.s 

:0(1) 
But,

T 



Y,XtSt^Y.^AXt-i+et)St 

t=i t=i 

T T T T 

= AY, Xt-iSt-1 +AJ2 Xt-iet + J2 +Y.^t 
t=i t=i t=i t=i 

T T T 

t=i f=i f=i 

+ o((X: 5,^_i)^/^(logf: Ci)^) + 0(r) a.s. 
t=i t=i 

T 

= A(^X,_i5*_i) + o(Ti/2(iogr)^) 
t=l 

+ a((X:d)^/'(logT)^)+0(r) 



This implies that

(9) $Z_T = A Z_{T-1}(1 + o(1)) + o(1)$ a.s.

Combining (8) and (9), we have that

(10) $Z_{T-1} - A Z_{T-1} = o(1)$ a.s.

Therefore, any limit point $z$ of $\{Z_T\}$ would satisfy

(11) $z - Az = 0$

Since 1 is not an eigenvalue of $A$, $z = 0$. Using the same method one can prove
that

$\lim_{T\to\infty} \frac{\sum_{t=1}^{T} X_t T_t}{T^{1/2}\bigl(\sum_{t=1}^{T} T_t^2\bigr)^{1/2}} = 0$ a.s.

This proves Lemma 1. □
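Lemma 2. If $E|\varepsilon_t|^{\alpha} < \infty$ for some $\alpha > 2$, then

$\lim_{T\to\infty} \frac{\sum_{t=1}^{T} S_t T_t}{\bigl[\bigl(\sum_{t=1}^{T} S_t^2\bigr)\bigl(\sum_{t=1}^{T} T_t^2\bigr)\bigr]^{1/2}} = 0$ a.s.

Proof. Note that $T_t' = (-1)^t T_t = \sum_{k=1}^{t} (-1)^k \varepsilon_k$. Using Theorem 3.2 of Philipp on
page 234 of Eberlein and Taqqu, (4) and (5) hold if we replace $S_t$ by $T_t$.
Therefore,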
Therefore, 
Let



Ut 



Then 

(12) 
But 



Ut — ut-1 



StTt 



0(1) a.s. 



1) 



t=i 



^ StTt = J2^St-i + et)i-Tt^i + et) 
t=i 

T T T T 

= - ^ St-iTt-i + ^ St^iet - ^ Tt-ie* + e 

t=l t=l 4=1 t=l 

t=i t=i t=i 

T T 

+ o((^T2)i/2(iog(^T2))) + 0(T) a.s. 



t=i 



t=i 



Therefore, 



Ut 



Ef=Y St Tt ^ ^(i^sCL^) ^ o(i^l(£LSl ) 



(13) 



Z]t=i sf 



= -C/t-i(1 + o(1))+o(1) a.s. 
= —ut-i + 0(1) a.s. 



Combining (fT2|) and (fT3|) . since 

wt = 0(1) a.s. Ut — > a.s. 

□

Now, we are ready to state our main result.
Let

(14) $y_t = \beta_1 y_{t-1} + \cdots + \beta_p y_{t-p} + \varepsilon_t$

be an AR(p) model with

(15) $\phi(z) = 1 - \beta_1 z - \cdots - \beta_p z^p$
(16) $\phantom{\phi(z)} = (1 - z)(1 + z)\,\psi(z)$

where $\psi(z) = 1 - \psi_1 z - \cdots - \psi_q z^q$ is a polynomial of order $q = p - 2$ which has all
roots outside the unit circle.

Theorem 1. Assume that the AR(p) model satisfies (16). If $\{\varepsilon_t\}$ is a
sequence of i.i.d. random variables with $E|\varepsilon_t|^{\alpha} < \infty$, where $\alpha > 2$, and $(y_0, \ldots, y_{1-p})$
is independent of $\{\varepsilon_t\}$, then

(17) $\lim_{T\to\infty} \frac{1}{\log T} \log\det\Bigl(\sum_{t=1}^{T} \mathbf{y}_t \mathbf{y}_t'\Bigr) = (p + 2)$ a.s.

where $\mathbf{y}_t' = (y_t, \ldots, y_{t-p+1})$.

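Proof. By Chan and Wei, there exists a non-singular $p \times p$ matrix $Q$ such that
$Q\mathbf{y}_t = (u_t, v_t, \mathbf{x}_t')'$, where

$\mathbf{x}_t = (x_{t-1}, \ldots, x_{t-q})'$,

$u_t = u_{t-1} + \varepsilon_t$,

$v_t = -v_{t-1} + \varepsilon_t$ and

$x_t = \psi_1 x_{t-1} + \cdots + \psi_q x_{t-q} + \varepsilon_t$.

Therefore, if we let $\mathbf{z}_t = Q\mathbf{y}_t$,

$\det\Bigl(\sum_{t=1}^{T} \mathbf{y}_t \mathbf{y}_t'\Bigr) = \det\Bigl[Q^{-1} \sum_{t=1}^{T} \mathbf{z}_t \mathbf{z}_t' (Q')^{-1}\Bigr] = \frac{\det\bigl(\sum_{t=1}^{T} \mathbf{z}_t \mathbf{z}_t'\bigr)}{(\det Q)^2}$.

To show (17), it is sufficient to show

(18) $\lim_{T\to\infty} \frac{1}{\log T} \log\det\Bigl(\sum_{t=1}^{T} \mathbf{z}_t \mathbf{z}_t'\Bigr) = (p + 2)$ a.s.

Let

$G_T = \mathrm{diag}\Bigl(\bigl(\sum_{t=1}^{T} u_t^2\bigr)^{-1/2},\ \bigl(\sum_{t=1}^{T} v_t^2\bigr)^{-1/2},\ T^{-1/2} I_q\Bigr)$

where $I_q$ is the $q \times q$ identity matrix.
Then

$G_T \sum_{t=1}^{T} \mathbf{z}_t \mathbf{z}_t'\, G_T' = \begin{pmatrix} 1 & a_T & b_T' \\ a_T & 1 & c_T' \\ b_T & c_T & \Gamma_T \end{pmatrix}$,

where

$a_T = \frac{\sum_{t=1}^{T} u_t v_t}{\bigl[\bigl(\sum_{t=1}^{T} u_t^2\bigr)\bigl(\sum_{t=1}^{T} v_t^2\bigr)\bigr]^{1/2}}$, $\quad b_T = \frac{\sum_{t=1}^{T} u_t \mathbf{x}_t}{T^{1/2}\bigl(\sum_{t=1}^{T} u_t^2\bigr)^{1/2}}$, $\quad c_T = \frac{\sum_{t=1}^{T} v_t \mathbf{x}_t}{T^{1/2}\bigl(\sum_{t=1}^{T} v_t^2\bigr)^{1/2}}$

and

$\Gamma_T = \frac{1}{T}\sum_{t=1}^{T} \mathbf{x}_t \mathbf{x}_t'$.

Let

$A = \begin{pmatrix} \psi_1 \ \cdots \ \psi_{q-1} & \psi_q \\ I_{q-1} & 0 \end{pmatrix}$

Then $A$ has all eigenvalues inside the unit circle and $\mathbf{x}_{t+1} = A\mathbf{x}_t + \boldsymbol{\varepsilon}_t$ as in Lemma 1. Therefore,
there exists a non-singular matrix $\Gamma$ such that

$\lim_{T\to\infty} \Gamma_T = \Gamma$ a.s.

Furthermore, by Lemmas 1 and 2,

$\lim_{T\to\infty} a_T = 0$, $\quad \lim_{T\to\infty} b_T = 0$, $\quad \lim_{T\to\infty} c_T = 0$ a.s.

Consequently,

$\lim_{T\to\infty} G_T \sum_{t=1}^{T} \mathbf{z}_t \mathbf{z}_t'\, G_T' = \begin{pmatrix} 1 & 0 & 0' \\ 0 & 1 & 0' \\ 0 & 0 & \Gamma \end{pmatrix}$ a.s.

Since $\Gamma$ is nonsingular, (18) is proved if

$\log\det(G_T^{-2}) = \log\Bigl(\sum_{t=1}^{T} u_t^2\Bigr) + \log\Bigl(\sum_{t=1}^{T} v_t^2\Bigr) + q\log T$

(19) $\phantom{\log\det(G_T^{-2})} \sim (p + 2)\log T$ a.s.

By (4) and (5) of Lemma 1,

$\lim_{T\to\infty} \frac{1}{\log T} \log\sum_{t=1}^{T} u_t^2 = 2$ a.s.

A similar result holds for $\{v_t\}$. Therefore,

$\log\det(G_T^{-2}) \sim (4 + q)\log T = (p + 2)\log T$ a.s.

This completes our proof. □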

Remark 1. Theorem 3 of Wei shows that under similar assumptions as in our
analysis,

(20) $C_T \sim \sigma^2 \log\det\Bigl(\sum_{t=1}^{T} \mathbf{y}_t \mathbf{y}_t'\Bigr)$ a.s.

Thus,

$C_T \sim (p + 2)\sigma^2 \log(T)$ a.s.

Remark 2. Theorem 1 and Remark 1 have an immediate implication for model
selection and can greatly simplify the proof of Theorem 3.5 of Wei. Let $p^*$
be known and $p_0 = \max\{j : \beta_j \ne 0,\ 1 \le j \le p^*\}$ as in (1). Denote $PLS_T(p) =
\sum_{t=t_0}^{T} (y_t - \hat y_t)^2$, where $\hat y_t$ is the forecast of $y_t$ based upon information up to $t-1$
using the AR(p) model as in (1), and $PLS_T(\hat p_T) = \inf\{PLS_T(j) : 1 \le j \le p^*\}$. Wei
showed that for both cases of underspecifying and overspecifying the AR order ($j$),
$P(PLS_T(j) > PLS_T(p_0)$ eventually$) = 1$. Thus, $P[\hat p_T = p_0$ eventually$] = 1$.
For the case of overspecification, Wei decomposed $\phi_p(z)$ into a unit root
component and a stable component, and worked out the difference of $C_T$ between
the true and the overspecified models. Our results can greatly simplify the proof.
Let $C_T^{(j)} = \sum_{t=1}^{T} (y_t - \hat y_t^{(j)} - \varepsilon_t)^2$, where $\hat y_t^{(j)}$ is the forecast of $y_t$ at $t-1$ using
the AR(j) model. For the case of overspecification, $\beta_j = 0,\ \forall j > p_0$. Applying
Theorem 1 and Remark 1, $C_T^{(j)} \sim (j + 2)\sigma^2 \log(T) > (p_0 + 2)\sigma^2 \log(T) \sim C_T^{(p_0)}$ a.s.

As for the case of underspecification, $j < p_0$, the desired result, $P[PLS_T(j) >
PLS_T(p_0)$ eventually$] = 1$, is a direct consequence of Theorem 3.2 of Wei
since $\beta_{p_0} \ne 0$. Thus, $P[\hat p_T = p_0$ eventually$] = 1$.
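The predictive least squares selection rule in Remark 2 can be sketched as follows. This is a minimal illustration assuming NumPy; the function names `pls` and `select_order_pls`, the starting index `t0` and the simulated AR(2) series are hypothetical choices made for the example.

```python
import numpy as np

def pls(y, p, t0):
    """Accumulated squared one-step errors of a recursively fitted AR(p), t = t0..T-1."""
    y = np.asarray(y, dtype=float)
    total = 0.0
    for t in range(t0, len(y)):
        # Regress y_s on (y_{s-1}, ..., y_{s-p}) using s = p, ..., t-1 only.
        X = np.column_stack([y[p - j:t - j] for j in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, y[p:t], rcond=None)
        y_hat = y[t - p:t][::-1] @ beta      # regressors (y_{t-1}, ..., y_{t-p})
        total += (y[t] - y_hat) ** 2
    return total

def select_order_pls(y, p_max, t0):
    """Return the order in 1..p_max with the smallest accumulated prediction error."""
    return min(range(1, p_max + 1), key=lambda p: pls(y, p, t0))

rng = np.random.default_rng(0)
e = rng.standard_normal(300)
y = np.zeros(300)
for t in range(2, 300):
    y[t] = 0.5 * y[t - 1] - 0.7 * y[t - 2] + e[t]   # illustrative AR(2)
print(select_order_pls(y, p_max=4, t0=20))
```

Every candidate order is scored over the same forecast origins (`t0` onward), so the accumulated errors are directly comparable; `t0` should be at least `2 * p_max` so that each regression has enough observations.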

3. Implications and applications of the main theorem 

We have just proved that for an AR(p) process, $C_T \sim p\sigma^2 \log(T)$ if it is stationary
and $C_T \sim (p + 1)\sigma^2 \log(T)$ if there is a root of 1. Our theorem implies that if
the existence of a unit root is properly detected and the unit root constraint is imposed
in forming the forecast, then $C_T \sim (p - 1)\sigma^2 \log(T)$. That is, for a model with a unit
root, estimation is done for the differenced series rather than the level of the series.
By so doing, we reduce $C_T$ by $2\sigma^2 \log(T)$, which could be substantial for large $T$
and $\sigma^2$. However, it should be noted that $\sum_{t=1}^{T}(y_t - \hat y_t)^2$ is not severely affected
by the existence of a unit root, since $C_T$, which is of order $\log(T)$, is dominated
by $\sum_{t=1}^{T}\varepsilon_t^2$, which is of order $T$. This result is natural since it is the long-term
forecast and not the short-term forecast on which the unit root has a strong impact. These
findings are further confirmed in our simulation study in Section 5.
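As a quick worked illustration of the orders of magnitude discussed in this section, the sketch below evaluates the leading terms $(p+1)\sigma^2\log T$ and $(p-1)\sigma^2\log T$. The function name and the example values p = 3, $\sigma^2$ = 1, T = 1000 are illustrative assumptions.

```python
import math

def expected_ct(p, sigma2, T, impose_unit_root):
    """Leading term of C_T for an AR(p) with one unit root, as stated in Section 3."""
    order = p - 1 if impose_unit_root else p + 1
    return order * sigma2 * math.log(T)

free = expected_ct(3, 1.0, 1000, impose_unit_root=False)         # (p + 1) * log(T)
constrained = expected_ct(3, 1.0, 1000, impose_unit_root=True)   # (p - 1) * log(T)
print(free, constrained, free - constrained)   # the gap equals 2 * sigma^2 * log(T)
```

With these values the unconstrained leading term is twice the constrained one, while both are small next to the order-$T$ disturbance component.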

In addition, our theorem implies that for AR(p) processes with roots equal to or
less than 1 in magnitude, as $T \to \infty$,

(21) $\frac{1}{\log T} \log\det\Bigl(\sum_{t=1}^{T} \mathbf{y}_t \mathbf{y}_t'\Bigr) \to c$ a.s.

where $c = (p + 1)$ if there is a root of 1 and $c = p$ if all roots are less than one.
Equivalently,

(22) $d_T = \Bigl[\frac{1}{\log T} \log\det\sum_{t=1}^{T} \mathbf{y}_t \mathbf{y}_t' - p\Bigr]^{1/2} \to d$ a.s.

where $d$ is 1 if there is a root of 1 and 0 if there is no unit root. Note that if $p$ is
unknown but $r > p$ is given, (22) is still true with $r$ replacing $p$ in (22) and in the
definition of $\mathbf{y}_t$ in (17). In other words, our theorem shows that $d_T$ can be used as a
test statistic for a unit root. This issue will be further investigated in future research.
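The statistic in (21) and (22) needs only sums of products of the observations, which the following sketch computes. It assumes NumPy; the function names are illustrative, and clipping the argument of the square root at zero is an assumption added here for finite samples, not something stated in the paper.

```python
import numpy as np

def logdet_ratio(y, p):
    """(1 / log T) * log det(sum_t y_t y_t') with y_t = (y_t, ..., y_{t-p+1})'."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # Row t holds (y_t, y_{t-1}, ..., y_{t-p+1}) for t = p-1, ..., T-1.
    Y = np.column_stack([y[p - 1 - j:T - j] for j in range(p)])
    _, logdet = np.linalg.slogdet(Y.T @ Y)
    return logdet / np.log(T)

def d_stat(y, p):
    """Finite-sample version of d_T; negative arguments are clipped to zero (assumption)."""
    return float(np.sqrt(max(logdet_ratio(y, p) - p, 0.0)))

rng = np.random.default_rng(0)
T = 5000
eps = rng.standard_normal(T)
random_walk = np.cumsum(eps)                 # one root equal to 1
stationary = np.zeros(T)
for t in range(1, T):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]
print(d_stat(random_walk, 1), d_stat(stationary, 1))
```

Convergence is slow because the normalization is logarithmic, so finite-sample values can sit noticeably away from the limits 1 and 0; the sketch is meant to show the computation, not a calibrated test.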

4. Multi-step and adaptive forecast 

Our previous analysis focuses on the 1-step forecast, and there are cases when the
multiple-step forecast is the main concern. It is conjectured that our results can be extended
to multi-step forecasts, but the issue will be pursued elsewhere. Instead, we shall
concentrate our discussion on the relationship between model misspecification and
adaptive forecast.

By (1), we have

(23) $y_{t+h} = \beta_1 y_{t+h-1} + \cdots + \beta_p y_{t+h-p} + \varepsilon_{t+h}$

and

(24) $\hat y_{t+h} = \hat\beta_1 \hat y_{t+h-1} + \cdots + \hat\beta_p \hat y_{t+h-p}$

where $\hat y_{t+h-k} = y_{t+h-k}$ for $h \le k$. So, $\hat y_{t+h}$ can be recursively solved in the order of
$\hat y_{t+1}, \hat y_{t+2}, \ldots, \hat y_{t+h}$. This is the conventional Box-Jenkins multi-step forecaster.

Another way of generating the multi-step forecast is to solve for the model that
minimizes the multi-step forecast error and then use it to form the multi-step forecast
(see Ing [8], Bhansali, Weiss [13], and Tiao and Tsay). More specifically, the
$h$-step forecast error $e_t(h)$ at time $t$ is

$e_t(h) = \varepsilon_{t+h} + \Psi_1 \varepsilon_{t+h-1} + \cdots + \Psi_{h-1} \varepsilon_{t+1}$

where $\Psi_j$ is defined by $[1 - \beta_1 B - \cdots - \beta_p B^p]^{-1} = \Psi_0 + \Psi_1 B + \cdots$. The cost function
to be minimized is

(25) $C(h) = \sum_{t=1}^{T-h} e_t^2(h)$

Note that for different $h$ different models are used, and this explains the name
'adaptive' forecast. Solving (25) involves nonlinear optimization, as $\Psi_j$ is a nonlinear
function of $(\beta_1, \ldots, \beta_p)$. In practice, an approximate linear model is used. That is, the
following regression is performed

$y_t = \alpha_1 y_{t-h} + \alpha_2 y_{t-h-1} + \cdots + \alpha_p y_{t-h-p+1} + \varepsilon_t^{(h)}$

and

$\hat y_{t+h} = \hat\alpha_1 y_t + \hat\alpha_2 y_{t-1} + \cdots + \hat\alpha_p y_{t-p+1}$

The idea behind the adaptive forecast is that if the model is misspecified, that
is, if $p$ is mistakenly chosen, then this mistake will be amplified radically in the
long-term forecast. The adaptive forecast could avoid this compounding impact. It is
reasonable to expect good performance of the Box-Jenkins forecaster for a correctly
specified model and good performance of the adaptive forecaster for a misspecified model.
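The two multi-step schemes of this section, the recursive plug-in (Box-Jenkins) forecaster and the adaptive forecaster based on a direct h-step regression, can be contrasted in a short sketch. It assumes NumPy; the function names `fit_ar`, `plug_in_forecast` and `direct_forecast` are hypothetical and the regressions use no intercept.

```python
import numpy as np

def fit_ar(y, p, h=1):
    """Least squares regression of y_t on (y_{t-h}, ..., y_{t-h-p+1}), no intercept."""
    y = np.asarray(y, dtype=float)
    start = h + p - 1                       # first t with all regressors available
    X = np.column_stack([y[start - h - j:len(y) - h - j] for j in range(p)])
    coef, *_ = np.linalg.lstsq(X, y[start:], rcond=None)
    return coef

def plug_in_forecast(y, p, h):
    """Fit a one-step AR(p), then iterate it h times, feeding forecasts back in."""
    beta = fit_ar(y, p, h=1)
    hist = list(y[-p:])
    for _ in range(h):
        hist.append(float(np.dot(beta, hist[::-1][:p])))   # most recent value first
    return hist[-1]

def direct_forecast(y, p, h):
    """Adaptive forecast: regress y_t directly on values h or more steps back."""
    alpha = fit_ar(y, p, h=h)
    return float(np.dot(alpha, np.asarray(y[-p:], dtype=float)[::-1]))

series = [0.9 ** k for k in range(20)]      # noiseless AR(1) with coefficient 0.9
print(plug_in_forecast(series, 1, 3), direct_forecast(series, 1, 3))
```

On a correctly specified, noiseless series both forecasters coincide; they differ once the fitted order is wrong or the sample is noisy, which is the situation the simulations and the empirical study examine.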

Ing, Lin and Yu [10] propose a predictor selection criterion to choose the best
combination of prediction models (AR lags) and prediction methods (adaptive or
plug-in). When there is only one unit root, the proposed method is proved to be
asymptotically efficient in the sense that the predictor converges with probability
one to the optimal predictor, which has the minimal loss function.

5. Monte Carlo experiments 

To assess the theoretical results obtained in the previous sections and to gain experience
for the empirical analysis in the sequel, we conduct two Monte Carlo experiments.
The first investigates the finite sample properties of $C_T$ in Theorem 1 and the
second compares forecasts across alternative forecasters. For both cases, we
generate data from the following four models:

• Model 1: $(1 - 0.5B)^2 (1 - B) y_t = \varepsilon_t$, or $y_t = 2 y_{t-1} - 1.25 y_{t-2} + 0.25 y_{t-3} + \varepsilon_t$.
Roots are 0.5, 0.5 and 1.0 respectively.

• Model 2: $(1 - 0.5B)^2 (1 - 0.99B) y_t = \varepsilon_t$, or $y_t = 1.99 y_{t-1} - 1.24 y_{t-2} +
0.2475 y_{t-3} + \varepsilon_t$.
Roots are 0.5, 0.5 and 0.99 respectively.

• Model 3: $(1 - 0.5B)^2 (1 - 0.95B) y_t = \varepsilon_t$, or $y_t = 1.95 y_{t-1} - 1.2 y_{t-2} +
0.2375 y_{t-3} + \varepsilon_t$.
Roots are 0.5, 0.5 and 0.95 respectively.

• Model 4: $(1 - 0.5B)^3 y_t = \varepsilon_t$, or $y_t = 1.5 y_{t-1} - 0.75 y_{t-2} + 0.125 y_{t-3} + \varepsilon_t$.
All roots are 0.5.

$\sigma^2$ is set to be 1 for all models.
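The AR coefficients of the four models follow from expanding the factored lag polynomials, which the short standard-library sketch below verifies. The helper names `poly_mul` and `ar_coefficients` are illustrative.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists in increasing powers of B."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def ar_coefficients(inverse_roots):
    """Expand prod_k (1 - r_k B) = 1 - b1 B - ... - bp B^p and return [b1, ..., bp]."""
    poly = [1.0]
    for r in inverse_roots:
        poly = poly_mul(poly, [1.0, -r])
    return [-c for c in poly[1:]]

for roots in ([0.5, 0.5, 1.0], [0.5, 0.5, 0.99], [0.5, 0.5, 0.95], [0.5, 0.5, 0.5]):
    print(roots, ar_coefficients(roots))
```

The "roots" quoted for each model are the reciprocals of the roots of the characteristic polynomial, which is why a value of 1.0 corresponds to a unit root and 0.5 to a stable factor.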



5.1. Monte Carlo experiment on Ct 

The number of replications is 1000 for each experiment. For each replication, 10
sets of samples are drawn from each model with sample size, $T$, varying from 100,
200 up to 1000. For each sample, starting from $t = t_0\ (= 10)$, the model parameters are
estimated and then used to forecast $y_{t+1}$. Then we re-estimate the model using the
sample from 1 to $t + 1$ and forecast $y_{t+2}$. The process is repeated until the first $T - 1$
observations are used to estimate the model, which is then used to forecast $y_T$. The forecast
mean square error is then summed from $t_0 + 1$ to $T$ to obtain $C_T$. Finally, we
compute the averaged $C_T$ obtained from 1000 replications. In other words,

(26) $C_T = \frac{\sum_{i=1}^{1000} \sum_{t=t_0}^{T-1} (\hat y_{i,t+1} - y_{i,t+1})^2}{(1000)(T - t_0)}$
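Re-estimating the model at every forecast origin is what makes the experiment above expensive. One standard way to avoid refitting from scratch is the recursive least squares update based on the Sherman-Morrison identity, sketched below with NumPy. The paper mentions an updating formula but does not spell it out, so this particular form and the name `rls_update` are assumptions.

```python
import numpy as np

def rls_update(P, beta, x, y_new):
    """One recursive least squares step.

    P is the inverse of sum_k x_k x_k' (symmetric) and beta the current estimate.
    Returns the pair (P, beta) after adding the observation (x, y_new).
    """
    Px = P @ x
    gain = Px / (1.0 + x @ Px)                    # Sherman-Morrison gain vector
    beta_new = beta + gain * (y_new - x @ beta)   # correct by the prediction error
    P_new = P - np.outer(gain, Px)                # rank-one downdate of the inverse
    return P_new, beta_new

# Illustration: start from a batch fit on 10 rows, then update row by row.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.standard_normal(50)
P = np.linalg.inv(X[:10].T @ X[:10])
beta = P @ X[:10].T @ y[:10]
for k in range(10, 50):
    P, beta = rls_update(P, beta, X[k], y[k])
print(beta)
```

Each update costs on the order of p squared operations instead of a full regression, and the quantity `y_new - x @ beta` computed inside the update is exactly the one-step prediction error that enters the accumulated sums.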



In addition, for each model, we repeat the procedure above with the constraint
that one of the roots is equal to one. The results are summarized in Table 1. As one
can easily see, over 40 million regressions have to be performed to obtain this table,
and the use of updating formulas can significantly reduce the computational burden.
In Table 1, the first column is the sample size. Results for the first model without the
unit root constraint (d = 0) and with 1 unit root imposed (d = 1) are put in the second
and third columns. Results for the other three models are put in columns 4 to 9. Our
theory predicts that: (1)



Table 1
C_T for simulated data

Roots:     0.5, 0.5, 1.0       0.5, 0.5, 0.99      0.5, 0.5, 0.95      0.5, 0.5, 0.5
T          d = 0     d = 1     d = 0     d = 1     d = 0     d = 1     d = 0     d = 1
100        23.47     12.33     23.47     12.33     23.80     15.36     21.07     23.22
200        27.55     14.71     27.55     14.71     27.71     20.19     24.28     37.60
300        29.90     16.06     29.90     16.06     29.83     23.90     26.09     50.86
400        31.49     17.00     31.49     17.00     31.21     27.17     27.32     63.57
500        32.75     17.73     32.75     17.73     32.26     30.27     28.29     75.96
600        33.76     18.30     33.76     18.30     33.09     33.12     29.04     88.12
700        34.62     18.79     34.62     18.79     33.80     36.01     29.69    100.41
800        35.38     19.22     35.38     19.22     34.40     38.89     30.26    112.72
900        35.99     19.60     35.99     19.60     34.94     41.65     30.76    124.80
1000       36.55     19.94     36.55     19.94     35.42     44.37     31.21    136.93
beta-hat   5.2849    2.8583    5.2849    2.8583    5.1947    5.2064    4.5635   13.8536
R^2        0.9988    0.9902    0.9988    0.9902    0.9920    0.6315    0.9930    0.4471

Table 2
MSE for simulated data

Roots:     0.5, 0.5, 1.0       0.5, 0.5, 0.99      0.5, 0.5, 0.95      0.5, 0.5, 0.5
T          d = 0     d = 1     d = 0     d = 1     d = 0     d = 1     d = 0     d = 1
100        117.81    106.92    117.81    106.92    118.33    110.06    115.59    118.17
200        227.10    214.61    227.10    214.61    227.36    220.24    224.11    237.66
300        334.88    321.53    334.88    321.53    334.97    329.57    331.53    356.59
400        441.30    427.33    441.30    427.33    441.27    437.73    437.66    474.10
500        547.43    533.00    547.43    533.00    547.19    545.69    543.55    591.33
600        653.42    638.61    653.42    638.61    653.01    653.67    649.36    708.93
700        759.51    744.24    759.51    744.24    758.92    761.63    755.18    826.46
800        865.20    849.45    865.20    849.45    864.35    869.13    860.55    943.18
900        970.92    954.95    970.92    954.95    970.01    976.96    966.16   1060.21
1000      1076.76   1060.56   1076.76   1060.56   1075.72   1084.89   1071.91   1177.63


$C_T$ increases linearly with $\log(T - t_0)$ and (2) $C_T$ without the unit root constraint is 2
times $C_T$ with the unit root constraint.

We run a simple regression of $C_T$ against $\log(T - t_0)$ without intercept for each
model and report the regression coefficients $\hat\beta$ and $R^2$ in the last two rows of Table 1.
For columns 2 and 3 of the table, the regression coefficients are 5.2849 and 2.8583
respectively, while the $R^2$ are greater than 0.99 in both cases. In summary, model 1
conforms to the theoretical results.
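The no-intercept regression described above can be reproduced from the tabulated values. The sketch below uses the first data column of Table 1 (model 1, d = 0) and $t_0 = 10$; reading the logarithm as the natural logarithm is an assumption, and the function name is illustrative.

```python
import math

T_values = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
ct_model1_d0 = [23.47, 27.55, 29.90, 31.49, 32.75, 33.76, 34.62, 35.38, 35.99, 36.55]

def slope_through_origin(x, y):
    """Least squares slope of y on x with no intercept: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x = [math.log(T - 10) for T in T_values]     # log(T - t0) with t0 = 10
slope = slope_through_origin(x, ct_model1_d0)
print(round(slope, 4))                       # to be compared with 5.2849 in Table 1
```

The fitted slope lies above the asymptotic value $(p + 1)\sigma^2 = 4$, which is a reminder that the logarithmic rate is approached slowly at these sample sizes.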

As for model 2, one of the roots is 0.99. Since it is the 1-step forecast that is the main
concern here, the result is almost the same as for model 1. This is consistent with
the finding of Lin and Tsay [13] that whether or not there is a unit root does not matter
much for short-term forecasts.

For model 3, the largest root is 0.95, which is not close enough to 1. Imposing the
unit root constraint produces a much larger $C_T$, and the stable relationship between
$C_T$ and $\log(T)$ deteriorates greatly, as is seen from the poor $R^2$. This can be justified
by the fact that differencing a stationary process produces a unit root in the MA
component, which cannot be approximated by a high-order AR. The situation becomes
much worse for model 4, where all roots are equal to 0.5.
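The over-differencing argument can be written out in one line. The display below is a sketch in the notation of equation (1), with $B$ the lag operator and $\Delta = 1 - B$; it is a standard manipulation and not taken verbatim from the paper.

```latex
\phi(B)\, y_t = \varepsilon_t
\quad\Longrightarrow\quad
\phi(B)\, \Delta y_t = (1 - B)\,\varepsilon_t ,
\qquad \Delta y_t = (1 - B)\, y_t .
```

When $y_t$ is stationary, the differenced series therefore has the moving average polynomial $1 - z$, whose root lies on the unit circle. Its formal inverse $(1 - B)^{-1} = 1 + B + B^2 + \cdots$ has coefficients that do not decay, so no finite-order AR fitted to $\Delta y_t$ can capture it well.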

For the purpose of comparison, we also report the corresponding conventional
MSE ($\sum_{t=t_0+1}^{T}(y_t - \hat y_t)^2$) for the same 4 models above in Table 2. We observe from
the table that, contrary to the case for $C_T$, the MSE for d = 0 is about the same
as for d = 1. This confirms our previous analysis that $C_T$, though an important
quantity for determining the quality of the forecast, is of second-order importance
compared to $\sum_{t=t_0+1}^{T}\varepsilon_t^2$. For the 1-step forecast, the distinction between a unit root and
a near unit root does not matter much.

5.2. Monte Carlo experiment on short-term and long-term forecast 
comparison 

This simulation is designed to evaluate the short-term and long-term forecasting
performance of alternative forecasters. The number of replications is again 1000.
For each replication, 400 observations are generated from the four models above.
The first 300 observations are reserved for estimation and then used to produce 1- to
60-step forecasts. Next, the model is re-estimated using the first 301 observations
and then used to forecast 1 to 60 steps ahead. The procedure is repeated until
the first 399 observations are used for estimation and the last 1-step-ahead
forecast is formed. So, we have 100 1-step forecasts, 99 2-step forecasts, and so on,
down to 41 60-step forecasts. Then, we compute the root mean square error (RMSE) for
the forecasts of each step. Finally, the resulting RMSE is averaged over 1000 replications.
More specifically, let $e_{i,t}(k)$ be the $k$-period-ahead forecast error at time $t$ of the $i$-th
replication. Then

(27) $\mathrm{RMSE}(\ell) = E(\ell) = \Bigl[\frac{\sum_{i=1}^{1000} \sum_{t} e_{i,t}(\ell)^2}{(1000)(100 - \ell + 1)}\Bigr]^{1/2}$

The simulation results are put in Tables 3 to 6. In each table, column 1 is the number of
forecast steps and column 2 is $E(\ell)$ for the model with p = 3 and d = 0, serving as the
benchmark for forecast comparison. Columns 3 to 7 are the $E(\ell)$ ratios (in percent) of
models with various p and d to column 2.

From these tables we observe the following. First, for stationary processes, the
$E(\ell)$ for the correctly specified model converges to a constant, with the rate of convergence
depending upon the value of the root. For a root of 0.5, $E(\ell)$ approaches a constant
as early as $\ell = 6$, while for a root of 0.95 it does not stabilize until $\ell = 30$. As for a root
of 0.99, it is so close to 1 that $E(\ell)$ is still increasing after $\ell = 60$. For a process with a
unit root, $E(\ell)$ increases with $\ell$ over the whole range of $\ell$. Second, the true model
outperforms the other, misspecified models in forecasting. Third, over-specification of
the unit root results in poor forecasts. For the case of model 4 (Table 6), $E(\ell)$ for d = 1 is 5%
higher than for d = 0 at the 1-step forecast and is more than 50% higher for $\ell$ greater
than 40. For model 3, one of the roots is 0.95 and the forecaster for d = 1 is still 45% worse
than d = 0 at the 60-step forecast, though a little better than for model 4. As for model 2,
one of the roots is 0.99 and for up to 20 steps, d = 1 fares as well as d = 0 and is only 10%
worse than the true model at the 60-step forecast. Fourth, under-specification of the unit
root only results in a small increase of $E(\ell)$. From Table 3, the inefficiency is less than
4% from 1-step to 60-step forecasts. Fifth, under- or over-specification of the AR order



Table 3
Forecasting comparison for simulated data: true p = 3, roots are 0.5, 0.5, 1.0

Steps    E(l)       E(l) ratio of MSE to model with p = 3, d = 0
l        p = 3      p = 3      p = 2      p = 2      p = 4      p = 4
         d = 0      d = 1      d = 0      d = 1      d = 0      d = 1
1         3.26      99.71     103.25     102.98     100.15      99.86
2         7.34      99.49     102.73     102.23     100.15      99.64
3        11.69      99.27     102.52     101.78     100.15      99.41
4        15.92      99.04     102.58     101.56     100.14      99.18
5        19.89      98.80     102.77     101.44     100.14      98.94
6        23.57      98.57     103.01     101.32     100.14      98.69
7        26.95      98.34     103.26     101.18     100.14      98.46
8        30.08      98.12     103.51     101.01     100.15      98.24
9        33.00      97.91     103.76     100.81     100.15      98.03
10       35.72      97.72     104.01     100.59     100.15      97.83
15       47.47      97.06     105.09      99.56     100.14      97.12
20       57.04      96.73     105.83      98.81     100.15      96.77
25       65.29      96.65     106.17      98.39     100.14      96.68
30       72.70      96.61     106.24      98.05     100.12      96.63
35       79.58      96.67     106.08      97.95     100.09      96.69
40       85.99      96.76     105.70      97.81     100.05      96.78
45       92.08      96.92     105.24      97.76     100.01      96.94
50       98.03      97.12     104.66      97.86      99.98      97.14
55      103.82      97.21     104.05      97.97      99.94      97.23
60      109.21      97.26     103.43      97.96      99.87      97.30

Table 4
Forecasting comparison for simulated data: true p = 3, roots are 0.5, 0.5, 0.99

Steps    E(l)       E(l) ratio of MSE to model with p = 3, d = 0
l        p = 3      p = 3      p = 2      p = 2      p = 4      p = 4
         d = 0      d = 1      d = 0      d = 1      d = 0      d = 1
1         3.26      99.91     103.18     103.27     100.15     100.06
2         7.32      99.85     102.70     102.74     100.15      99.99
3        11.61      99.79     102.53     102.51     100.15      99.93
4        15.76      99.74     102.61     102.54     100.14      99.87
5        19.62      99.70     102.81     102.67     100.14      99.82
6        23.14      99.66     103.06     102.82     100.15      99.77
7        26.36      99.63     103.31     102.95     100.15      99.75
8        29.30      99.62     103.57     103.04     100.17      99.73
9        32.00      99.63     103.82     103.10     100.18      99.72
10       34.50      99.65     104.06     103.14     100.18      99.74
15       44.87     100.10     105.10     103.38     100.22     100.13
20       52.75     100.98     105.62     103.95     100.26     100.98
25       59.01     102.16     105.59     104.90     100.27     102.14
30       64.21     103.31     105.24     105.83     100.25     103.28
35       68.72     104.51     104.66     106.92     100.21     104.47
40       72.71     105.65     103.95     107.84     100.15     105.60
45       76.32     106.85     103.14     108.83     100.11     106.80
50       79.68     108.11     102.32     110.03     100.07     108.06
55       82.76     109.27     101.56     111.27     100.03     109.22
60       85.54     110.20     100.83     112.15      99.98     110.15


Table 5
Forecasting comparison for simulated data: true p = 3, roots are 0.5, 0.5, 0.95

Steps    E(l)       E(l) ratio of MSE to model with p = 3, d = 0
l        p = 3      p = 3      p = 2      p = 2      p = 4      p = 4
         d = 0      d = 1      d = 0      d = 1      d = 0      d = 1
1         3.26     100.87     102.92     104.61     100.16     101.00
2         7.19     101.59     102.50     105.05     100.17     101.70
3        11.20     102.35     102.39     105.87     100.17     102.43
4        14.93     103.15     102.50     107.02     100.18     103.22
5        18.24     104.01     102.69     108.34     100.19     104.06
6        21.11     104.92     102.91     109.71     100.20     104.94
7        23.59     105.87     103.11     111.09     100.22     105.86
8        25.72     106.85     103.29     112.41     100.24     106.81
9        27.57     107.85     103.46     113.70     100.26     107.79
10       29.17     108.88     103.60     114.95     100.28     108.79
15       34.68     114.26     103.91     120.95     100.35     114.08
20       37.59     119.71     103.34     126.69     100.37     119.46
25       39.16     124.73     102.28     132.01     100.32     124.42
30       40.04     128.94     101.30     136.47     100.26     128.60
35       40.61     132.40     100.65     140.17     100.19     132.01
40       41.02     135.16     100.27     142.89     100.14     134.75
45       41.31     138.01      99.99     145.70     100.11     137.58
50       41.50     140.97      99.79     148.84     100.08     140.50
55       41.61     143.76      99.68     152.10     100.06     143.26
60       41.68     145.57      99.66     154.16     100.04     145.04

Table 6
Forecasting comparison for simulated data: true p = 3, roots are all 0.5

Steps    E(l)       E(l) ratio of MSE to model with p = 3, d = 0
l        p = 3      p = 3      p = 2      p = 2      p = 4      p = 4
         d = 0      d = 1      d = 0      d = 1      d = 0      d = 1
1         3.26     105.23     100.68     108.65     100.13     104.82
2         5.90     110.01     100.58     114.55     100.13     109.11
3         7.70     115.20     100.65     121.43     100.13     113.78
4         8.74     120.67     100.72     128.63     100.13     118.68
5         9.28     126.00     100.77     135.33     100.14     123.50
6         9.53  [illegible] [illegible] [illegible]  100.14     127.93
7         9.64     134.52     100.74     145.38     100.14     131.62
8         9.68  [illegible] [illegible] [illegible] [illegible] 134.39
9         9.69     139.30     100.49     150.91     100.12     136.32
10        9.69     140.71     100.32     152.55     100.10     137.64
15        9.68     144.26      99.98     156.66     100.04     140.87
20        9.68     145.69      99.98     158.33     100.02     142.22
25        9.66     147.22      99.98     160.24     100.00     143.62
30        9.66     148.04      99.98     161.31     100.00     144.37
35        9.67     148.70      99.98     162.52     100.00     144.86
40        9.68     148.98      99.99     162.92     100.00     145.11
45        9.68     150.37      99.99     164.35     100.00     146.48
50        9.69     153.41      99.98     167.67      99.99     149.35
55        9.68     155.18      99.99     169.99      99.99     150.98
60        9.66     155.33      99.99     170.78      99.99     150.92


only affects the forecast precision marginally. The $E(\ell)$ for all models are within
6% of the true model for all forecasts up to 60 steps ahead.

To sum up, the simulations show that slight misspecification of the AR order and
under-specification of the unit root are not serious in forecasting, but over-specification
of the unit root could result in poor forecasts when the root of the characteristic polynomial
is far from 1. Yet, the improvement of forecasting precision in absolute terms could be
substantial for large samples when the existence of a unit root is appropriately taken
into consideration.

6. Empirical results 

6.1. Data 

For the empirical analysis, we analyze six of the most frequently used data sets in Taiwan:
Gross Domestic Product (GDP), Consumer Price Indices (CPI), Wholesale
Price Indices (WPI), Interest Rates (IR), the Exchange Rate of the New Taiwan Dollar to
the US Dollar (RX) and the money supply (M1B). All series are quarterly data taken from
the AREMOS databank. The sample period is 1961:1 to 1995:4, except for M1B,
which ranges from 1961:3 to 1995:4. Hence the sample size is 138 for M1B and 140 for
the remaining series. All series are seasonally unadjusted.

6.2. Order selection 

Simultaneous selection of the lag order p and the forecasting method is analyzed in Ing,
Lin and Yu [3]. Here, we follow the conventional approach and use AIC and chi-square
statistics to determine p. When the AIC has a clear minimum, we select the
order corresponding to the minimal AIC. When the AIC is decreasing without a clear
minimum, we use chi-square statistics to select the last significant lag. It turns out
that CPI, WPI and RX have order 2, IR has order 6, M1B has order 3
and GDP has order 8. The high order indicates the possible existence of a seasonal
unit root, which is not investigated here.
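
The AIC part of this rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used for the paper: the helper names `fit_ar_ols` and `select_order_aic`, the intercept term, the common estimation sample, and the AIC form n log(sigma2) + 2(p + 1) are all choices made here, and the chi-square fallback for a steadily decreasing AIC is not implemented.

```python
import numpy as np


def fit_ar_ols(y, p, p_max):
    """OLS fit of an AR(p) model with intercept (hypothetical helper).

    The first p_max observations are dropped for every p, so that all
    candidate orders are compared on the same estimation sample.
    """
    y = np.asarray(y, dtype=float)
    n = len(y) - p_max
    target = y[p_max:]
    # Column j holds the series lagged by j periods, aligned with target.
    lags = [y[p_max - j:len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = float(resid @ resid) / n  # residual variance estimate
    return beta, sigma2, n


def select_order_aic(y, p_max):
    """Return the order in 0..p_max with the smallest AIC, and all AIC values."""
    aic = {}
    for p in range(p_max + 1):
        _, sigma2, n = fit_ar_ols(y, p, p_max)
        aic[p] = n * np.log(sigma2) + 2 * (p + 1)  # assumed AIC form
    best = min(aic, key=aic.get)
    return best, aic
```

For a quarterly series one might call `select_order_aic(y, p_max=8)` and inspect the returned AIC values for a clear minimum before turning to significance tests on the last lag.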

6.3. Forecasting procedure 

For each series, the first 100 observations are reserved for estimation and 1- to
20-step forecasts are computed. Then the model is re-estimated using the first 101
observations and another set of 1- to 20-step forecasts is computed. The procedure is
repeated until the first T - 1 observations are used to estimate the model
and the last 1-step forecast is computed. Hence, we have 40 1-step forecasts, 39
2-step forecasts and 20 20-step forecasts, except for M1B, where there are 38 1-step
forecasts and 18 20-step forecasts. For each step, the root mean square error (RMSE)
is computed.
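
A minimal Python sketch of this expanding-window scheme is given below. It covers only the plug-in (Box-Jenkins style) forecast, not the adaptive forecaster, and rests on assumptions made here for illustration: the function names are hypothetical, an intercept is included, and d = 1 is imposed by fitting an AR(p) model to the first differences and cumulating the resulting forecasts.

```python
import numpy as np


def ar_ols(y, p):
    """OLS estimates (intercept, phi_1, ..., phi_p) of an AR(p) model, p >= 1."""
    y = np.asarray(y, dtype=float)
    n = len(y) - p
    # Column j holds the series lagged by j periods, aligned with y[p:].
    lags = [y[p - j:len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(n)] + lags)
    beta, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return beta


def iterate_forecast(history, beta, steps):
    """Multi-step forecasts obtained by iterating the fitted AR recursion."""
    p = len(beta) - 1
    buf = [float(v) for v in history[-p:]]
    out = []
    for _ in range(steps):
        recent = buf[::-1][:p]  # most recent value first, matching phi_1..phi_p
        pred = float(beta[0] + np.dot(beta[1:], recent))
        out.append(pred)
        buf.append(pred)  # later steps reuse earlier forecasts
    return np.array(out)


def rolling_rmse(y, p, d=0, n_init=100, max_step=20):
    """RMSE by forecast step from an expanding estimation window."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    sq_err = {h: [] for h in range(1, max_step + 1)}
    for t in range(n_init, T):  # t = number of observations used for estimation
        sample = y[:t]
        steps = min(max_step, T - t)  # keep only forecasts with a realized value
        if d == 1:
            dy = np.diff(sample)
            fc = sample[-1] + np.cumsum(iterate_forecast(dy, ar_ols(dy, p), steps))
        else:
            fc = iterate_forecast(sample, ar_ols(sample, p), steps)
        for h in range(1, steps + 1):
            sq_err[h].append((y[t + h - 1] - fc[h - 1]) ** 2)
    rmse = {h: float(np.sqrt(np.mean(e))) for h, e in sq_err.items() if e}
    counts = {h: len(e) for h, e in sq_err.items()}
    return rmse, counts
```

With T = 140 and `n_init=100`, the returned counts contain 40 one-step and 39 two-step forecast errors, in line with the numbers quoted above.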

6.4. Results 

The results are reported in Tables 7 to 12. From the tables we observe the following.
First, E(ℓ) increases linearly with ℓ for all series except for IR. This
seems to suggest that, except for IR, all variables have a unit root. Second, regarding
the Box-Jenkins forecast, imposing the unit root constraint results in poor forecasts at
all steps ahead for WPI, CPI, GDP and IR. In particular, for IR the RMSE for d = 1
is 200% higher than that for d = 0. This seems to be consistent with the finding that
its E(ℓ) converges to a constant very quickly. However, for RX the forecast with d = 1
fares much better than the forecast with d = 0. The precision gain from imposing the unit
root is about 5% for the 1-step forecast and rises to over 30% for the 20-step forecast.
This seems to indirectly support the efficient market hypothesis for the foreign

Table 7
Forecasting comparison for GDP

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1     12258.04        101.94        100.00         99.22
      2     18024.46        104.69        101.47        179.52
      3     21232.71        106.87        115.10        151.59
      4     24719.28        108.70        106.82         83.76
      5     31938.87        110.92        102.59         99.66
      6     37316.81        112.05        116.19        125.65
      7     40063.38        112.71        132.45         98.06
      8     40505.82        109.83        147.52         45.57
      9     46605.77        111.94        158.02         65.51
     10     52966.58        116.89        159.53        103.16
     11     57551.40        121.35        157.04         91.13
     12     59480.32        120.18        169.46         66.06
     13     66651.35        120.95        181.21         85.92
     14     74760.98        123.52        184.47        109.41
     15     79555.22        124.66        174.53         94.18
     16     81162.26        120.64        173.44         74.78
     17     91194.77        117.99        185.95         61.83
     18     99975.45        122.09        181.84         99.95
     19    105256.88        125.83        162.48         83.94
     20    108809.60        122.63        162.14         55.65

Table 8
Forecasting comparison for CPI

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1         1.12         99.39        100.00        101.46
      2         1.71         98.81         83.53         77.07
      3         1.82         99.20        100.10         69.33
      4         1.83         99.75        123.42         79.90
      5         2.09         98.49        128.11         84.27
      6         2.51         99.73        124.42         81.85
      7         2.54        101.83        118.32         89.80
      8         2.65        102.66        165.72         95.17
      9         2.97        102.13        163.20         95.48
     10         3.19        101.81        171.06         95.60
     11         3.06        106.53        209.48         98.15
     12         3.12        108.45        237.77         96.83
     13         3.55        106.61        236.77         86.76
     14         3.71        107.33        257.23         89.74
     15         3.76        111.25        289.99         98.66
     16         3.83        113.36        326.41         99.02
     17         4.32        110.28        336.41         91.27
     18         4.50        111.09        372.91         89.71
     19         4.44        113.86        431.77         86.81
     20         4.79        111.22        455.83         80.87

Table 9
Forecasting comparison for WPI

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1         1.16        102.47        100.00        101.19
      2         2.13        103.43         58.78         93.97
      3         3.05        104.54         50.13         97.35
      4         3.88        105.55         47.53         96.59
      5         4.56        107.43         49.39        108.12
      6         5.07        109.64         53.84        113.24
      7         5.41        112.38         61.27        113.87
      8         5.58        116.54         69.88        120.83
      9         5.89        120.00         76.71        132.83
     10         6.36        121.66         81.36        140.82
     11         6.93        122.71         86.87        137.69
     12         7.53        123.95         90.96        135.62
     13         8.02        125.92         96.49        141.24
     14         8.48        127.93        103.98        150.77
     15         8.82        130.43        113.97        157.73
     16         8.87        135.06        127.60        157.82
     17         8.97        139.02        141.47        173.71
     18         9.15        142.11        157.87        191.82
     19         9.52        143.65        174.20        193.95
     20        10.19        142.43        186.53        178.22

Table 10
Forecasting comparison for RX

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1         0.66         95.38        100.00         99.40
      2         1.32         92.38         53.40         90.39
      3         2.01         89.53         40.91         81.93
      4         2.77         88.16         37.25         75.67
      5         3.41         87.09         38.60         73.91
      6         3.99         86.26         42.45         72.13
      7         4.42         85.01         47.25         72.20
      8         4.71         83.30         54.59         70.22
      9         5.03         82.30         61.77         68.81
     10         5.29         81.93         68.35         72.37
     11         5.58         81.90         74.77         75.55
     12         5.94         81.98         79.20         77.42
     13         6.31         81.57         81.79         76.30
     14         6.67         80.44         83.67         74.67
     15         6.93         78.33         85.17         74.42
     16         7.11         75.67         86.40         70.66
     17         7.34         72.78         86.34         62.28
     18         7.56         70.48         86.23         58.96
     19         7.87         69.58         85.50         59.46
     20         8.24         69.51         83.95         51.02

Table 11
Forecasting comparison for M1B

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1     96297.39         92.81        100.00        102.13
      2    152389.01         94.00         65.10         86.82
      3    208305.92         94.22         52.25         78.01
      4    266876.68         94.73         52.56         71.28
      5    377584.71         92.59         55.08         61.12
      6    481271.98         94.39         61.18         56.24
      7    532886.35         99.13         59.46         55.99
      8    595646.12        101.27         71.45         49.52
      9    668073.88        108.19         84.69         50.29
     10    774390.51        113.50         92.29         49.31
     11    821482.40        123.40         89.06         50.77
     12    886619.83        129.37         90.62         46.46
     13   1052170.67        129.01         89.81         42.85
     14   1158059.45        143.04         86.93         44.22
     15   1335812.44        145.93         70.46         44.98
     16   1378939.42        165.55         60.58         42.82
     17   1649465.49        162.79         56.17         46.78
     18   1748042.18        183.13         61.66         45.31
     19   1893323.08        200.50         51.17         41.98
     20   2079603.50        214.50         50.45         28.34

Table 12
Forecasting comparison for IR

                                 E(ℓ) ratio to model BJ, d = 0
Steps ℓ         E(ℓ)     BJ, d = 1   Adap, d = 0   Adap, d = 1
      1         0.72        104.58        100.00        106.60
      2         1.16        108.68         67.94         79.35
      3         1.24        114.84         64.70        108.65
      4         1.38        120.28         58.85        127.00
      5         1.63        124.48         48.70        125.50
      6         1.82        130.14         46.05        120.60
      7         1.83        138.25         50.44        124.52
      8         1.82        146.64         55.74        139.96
      9         1.83        154.97         58.67        170.56
     10         1.83        164.17         59.64        191.38
     11         1.76        175.17         60.49        208.55
     12         1.70        185.17         63.08        211.01
     13         1.67        194.23         65.27        208.26
     14         1.61        204.93         75.07        213.25
     15         1.52        216.39         91.16        219.55
     16         1.36        237.94         94.70        238.53
     17         1.27        254.73        108.89        239.50
     18         1.10        289.28        101.44        237.00
     19         0.84        370.74        131.31        269.90
     20         0.59        521.61        199.32        349.14

exchange market in Taiwan. As for M1B, imposing the unit root constraint improves
forecast precision for the 1- to 7-step forecasts but deteriorates forecast precision for the
8- to 20-step forecasts. The inefficiency is more than 100% for the 19- and 20-step
forecasts. Third, the performance of the adaptive forecaster is mixed. For RX and M1B,
the adaptive forecasts with d = 0 and d = 1 consistently outperform the conventional
Box-Jenkins forecast by a large margin. The precision gain can be as high as 50%.
For CPI, the adaptive forecast performs poorly for d = 0 but very well for d = 1. For
IR and WPI, the adaptive forecast with d = 0 performs well in short- and medium-term
forecasts but fares poorly in long-term forecasts, whereas the adaptive forecast with d = 1
performs adequately in the short term but very poorly in the long term. The case of GDP
is quite interesting. While the adaptive forecast with d = 0 fares poorly for short- and
long-term forecasts, the performance of the adaptive forecast with d = 1 jumps up and
down across steps. This seems to suggest that seasonality plays an important role for the
differenced GDP, which is supported by the corresponding autocorrelation function.
This issue will be investigated in a future study.
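
As a reading aid for Tables 7 to 12, each ratio entry can be converted into a percentage gain or loss relative to the Box-Jenkins forecast with d = 0. The sketch below assumes, as the discussion of RX suggests, that the gain is simply 100 minus the tabulated ratio; the function name `precision_gain` is introduced here for illustration, and the two inputs are the BJ, d = 1 entries for RX at steps 1 and 20 in Table 10.

```python
def precision_gain(ratio_percent):
    """Percentage gain (positive) or loss (negative) relative to the benchmark.

    Assumes the table entry is 100 * E(l) of the model / E(l) of BJ with d = 0.
    """
    return 100.0 - ratio_percent


# BJ, d = 1 entries for RX in Table 10 at steps 1 and 20.
for step, ratio in [(1, 95.38), (20, 69.51)]:
    print(f"step {step}: gain = {precision_gain(ratio):.2f}%")
# prints:
# step 1: gain = 4.62%
# step 20: gain = 30.49%
```

These values correspond to the "about 5%" and "over 30%" gains quoted for RX; a ratio above 100 gives a negative value, that is, a loss of precision.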

To sum up, the empirical findings are mixed. Imposing the unit root constraint
may improve forecast precision in some cases but deteriorate it in
others. Also, the adaptive forecast differs from the Box-Jenkins forecast by a large margin.
Most frequently, it improves short- to medium-term forecasts but results in poor
long-term forecasts. However, in some cases, it produces either better or worse
forecasts at all steps. Further study is needed to determine the influencing
factors.

7. Conclusions 

We have analyzed the least squares forecaster from various aspects. From the theoretical
viewpoint, we prove that C_T, the most important quantity when evaluating
the performance of 1-step forecasters, is equal to (p + d)σ² log(T), where d is 1 or
0 depending on whether there is a unit root. This result can be used to analyze the gain in
forecasting precision when a unit root is detected and taken into account. Further,
this theorem leads to a simple proof of the strong consistency of PLS in AR
model selection and to a new test for a unit root.
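
To give a sense of magnitudes, the sketch below evaluates the expression (p + d)σ² log(T) and compares it with Tσ², the order of the disturbance component. The function name, the use of the natural logarithm, and the inputs (p = 3, σ² = 1, T = 140) are assumptions chosen here for illustration; they are not results from the paper.

```python
import math


def estimation_term(p, d, sigma2, T):
    """Value of (p + d) * sigma^2 * log(T), with log taken as the natural log."""
    return (p + d) * sigma2 * math.log(T)


# Illustrative inputs only: p = 3, sigma^2 = 1, T = 140.
T, sigma2, p = 140, 1.0, 3
for d in (0, 1):
    c = estimation_term(p, d, sigma2, T)
    print(f"d = {d}: estimation term = {c:.2f}, T * sigma^2 = {T * sigma2:.0f}")
```

For these inputs the estimation term is small next to Tσ², yet it is the only part of the expression that changes with d.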

Our simulation analysis confirms the theoretical results. In addition, we
learn that while misspecification of the AR order has a marginal impact on forecasting
precision, over-specification of the unit root strongly deteriorates the quality of long-term
forecasts. As for the empirical study using Taiwanese data, the results are mixed.
Adaptive forecasting and imposing a unit root improve forecast precision in some cases
but deteriorate it in other cases.